Implicit Shape Kernel for Discriminative Learning of the Hough Transform Detector

Authors

  • Yimeng Zhang
  • Tsuhan Chen
Abstract

The sliding window approach has been widely used for object detection because it provides a simple way to apply object recognition techniques to the detection task. Despite its effectiveness, the exhaustive search makes the approach inefficient when the classifier is non-trivial. Branch and bound techniques [3, 4] have been proposed in recent years to avoid the exhaustive search while still finding the global optimum. The Hough transform [1, 2, 5, 6, 7, 8] provides an alternative way to perform the detection task much more efficiently. A Hough transform based detector has three main steps, as illustrated in Figure 1: 1) Each local patch in a test image is assigned to a codeword. 2) According to the spatial information of the codebook learned from the training images, each patch casts weighted votes for object locations and scales, producing the initial Hough image. 3) To tolerate shape deformation, kernel density estimation, such as Gaussian filtering or mean-shift mode estimation, is applied to the Hough image. This process yields the final Hough image, whose peaks are extracted as the detection hypotheses. Figure 1 also shows the three weights we need to learn. The implicit shape model [6] puts the Hough transform into a probabilistic formulation by learning the locations of the codewords (weight 2) based on their spatial distribution in the training images. Despite its success, the implicit shape model has two drawbacks. The first is its limited discriminative power. Several recent works [2, 7, 8] have addressed this issue, and all of them significantly improve on the implicit shape model. However, they learn discriminatively only the codeword weights (weight 1 or 3 in Figure 1), while the spatial weights are still learned generatively as the spatial distribution of the codewords in positive training examples (weight 2).
The second drawback is that the scores in the final Hough image are difficult to interpret, especially after the kernel density estimation, so it is hard to tell what function the learning process is optimizing. In this paper, we propose a novel approach for learning the Hough transform. The approach puts the whole Hough transform into a maximum margin formulation by connecting the Hough transform with the SVM through kernel methods. We design a kernel specifically for the Hough transform detector and call it the "Implicit Shape Kernel". During training, we use the kernel to train an SVM classifier, which determines the presence of the object of interest in a subwindow. During testing, we can follow the standard Hough transform process for the kernel calculations, and the final Hough image provides the exact output scores of the SVM at every location and scale. We briefly describe the implicit shape kernel, which is illustrated in Figure 2. For each pair of regions from two examples, if they are matched to the same codeword, we calculate the similarity of their locations relative to the object centers. The similarity is calculated using the window function Kw of the kernel density estimation. The window function can be either a Gaussian function or an Epanechnikov function. Different window functions also define different processes at detection time with the Hough transform. The kernel value K of the implicit shape kernel is the sum of the similarities of all such pairs. Let I denote an image, Ci denote a codeword entry, C(fk) denote the codeword assignment of feature fk, and xk denote the location of the feature.
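
The kernel described above can be sketched in a few lines of Python. This is only a minimal illustration built from the prose description, not the authors' implementation: the `Feature` tuple layout, the function names, the `sigma` and `bandwidth` parameters, and the omission of normalization constants in the window functions are all assumptions made for the example.

```python
import math
from typing import Callable, List, Tuple

# Hypothetical feature layout: (codeword_id, x, y), where (x, y) is the
# feature location relative to the object center.
Feature = Tuple[int, float, float]


def gaussian_window(dx: float, dy: float, sigma: float = 1.0) -> float:
    """Unnormalized Gaussian window evaluated on a location difference."""
    return math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))


def epanechnikov_window(dx: float, dy: float, bandwidth: float = 1.0) -> float:
    """Unnormalized Epanechnikov window; zero outside the bandwidth."""
    u2 = (dx * dx + dy * dy) / (bandwidth * bandwidth)
    return max(0.0, 1.0 - u2)


def implicit_shape_kernel(
    a: List[Feature],
    b: List[Feature],
    window: Callable[[float, float], float] = gaussian_window,
) -> float:
    """Sum the window similarity over all feature pairs sharing a codeword."""
    total = 0.0
    for code_a, xa, ya in a:
        for code_b, xb, yb in b:
            if code_a == code_b:  # only pairs matched to the same codeword
                total += window(xa - xb, ya - yb)
    return total


# Two toy examples: only the codeword-0 features match each other.
example_a = [(0, 0.0, 0.0), (1, 2.0, 0.0)]
example_b = [(0, 0.0, 0.0), (2, 2.0, 0.0)]
print(implicit_shape_kernel(example_a, example_b))
```

The double loop is quadratic in the number of features, which is acceptable for a sketch; grouping features by codeword first would avoid comparing pairs that can never match.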

Similar resources

Deeply Optimized Hough Transform: Application to Action Segmentation

Hough-like methods (Implicit Shape Model, Hough forest, ...) have been successfully applied in multiple computer vision fields like object detection, tracking, skeleton extraction or human action detection. However, these methods are known to generate false positives. To handle this issue, several works like Max-Margin Hough Transform (MMHT) or Implicit Shape Kernel (ISK) have re...

PRISM: PRincipled Implicit Shape Model

This paper addresses the problem of object detection by means of the Generalised Hough transform paradigm. The Implicit Shape Model (ISM) is a well-known approach based on this idea. It made this paradigm popular and has been adopted many times. Although the algorithm exhibits robust detection performance, its description, i.e. its probabilistic model, involves arguments which are unsatisfactor...

Learning Equivariant Functions with Matrix Valued Kernels

This paper presents a new class of matrix valued kernels that are ideally suited to learn vector valued equivariant functions. Matrix valued kernels are a natural generalization of the common notion of a kernel. We set the theoretical foundations of so called equivariant matrix valued kernels. We work out several properties of equivariant kernels, we give an interpretation of their behavior and...

Learning Equivariant Functions with Matrix Valued Kernels - Theory and Applications

This paper presents a new class of matrix valued kernels, which are ideally suited to learn vector valued equivariant functions. Matrix valued kernels are a natural generalization of the common notion of a kernel. We set the theoretical foundations of so called equivariant matrix valued kernels. We work out several properties of equivariant kernels, we give an interpretation of their behavior a...

Fully Automatic Model Creation for Object Localization utilizing the Generalized Hough Transform

An approach for automatic object localization in medical images utilizing an extended version of the generalized Hough transform (GHT) is presented. In our approach the shape model in the GHT is equipped with specific model point weights, which are used in the voting process. The weights are adjusted in a discriminative training procedure, which aims at a minimal localization error of the GHT. ...

Publication date: 2010